Abstract

Motivated by the question of how macromolecules assemble, the notion of an
assembly tree of a graph is introduced. Given a graph G, the paper is concerned
with enumerating the assembly trees of G, a problem that applies to
the macromolecular assembly problem. Explicit formulas or generating functions
are provided for the number of assembly trees of several families of graphs, in
particular for what we call $(H, \phi)$-graphs. In some natural special cases, we apply
powerful recent results of Zeilberger and Apagodu on multivariate generating
functions, and results of Wimp and Zeilberger, to deduce recurrence relations and very
precise asymptotic formulas for the number of assembly trees of the complete
bipartite graphs $K_{n,n}$ and the complete tripartite graphs $K_{n,n,n}$. Future directions
for research, as well as open questions, are suggested.

1 Introduction 

Although the context of this paper is graph theory, the concept of an assembly tree 
originated in an attempt to understand macromolecular assembly [3]. The capsid of 
a virus - the shell that protects the genomic material - self-assembles spontaneously, 
rapidly and quite accurately in the host cell. Although the structure of the capsid 
is fairly well known, the assembly process by which hundreds of subunits (monomers) 
interact to form the capsid is not well understood. In many cases, the capsid can be 
modeled by a polyhedron, the facets representing the monomers. The assembly of the 
capsid can be modeled by a rooted tree, the leaves representing the facets, the root 
the completed polyhedron, and the internal nodes intermediate subassemblies. The 
enumeration of such trees plays a central role in understanding how symmetry affects
the assembly process [3].

All graphs in this paper are simple. Let G = (V, E) be a connected graph of order n 
with vertex set V and edge set E. In the definition of an assembly tree T for the graph 
G, each node of T is labeled by a subset of V . No distinction will be made between the 
node and its label. For a node U in a rooted tree, c(U) denotes the set of children of U . 

Definition 1. An assembly tree for a connected graph G on n vertices is a rooted tree,
each node of which is labeled by a subset $U \subseteq V$ such that

1. each internal (non-leaf) node has at least two children,

2. there are n leaves with labels $\{v\}$, $v \in V$,

3. the label on the root is V,

4. $U = \bigcup c(U)$ for each internal node U.

An assembly tree T for G describes a process by which G assembles. At the beginning are 
the individual vertices of G - the leaves of T. Each internal node U of T represents the
subgraph of G induced by the subset U of vertices. Each internal node also represents the 
stage in the assembly process by which subgraphs of G join to form a larger subgraph; 
more precisely, the subgraphs induced by the children of U join to form the subgraph 
induced by U. The process terminates at the root of T - representing the entire graph 
G. Call two assembly trees $T_1$ and $T_2$ for a graph G equal if there is a label preserving
graph isomorphism between $T_1$ and $T_2$.

There are numerous ways, some mentioned in the last section, to further restrict 
how the assembly process occurs. In this paper we will assume that, at each stage, two 
subgraphs can be joined if and only if there is an edge that connects them. 

Definition 2. An assembly tree for a connected graph G = (V, E) using the edge gluing 
rule is an assembly tree for G satisfying the additional property: 

5. Each internal node has exactly two children, and if $U_1$ and $U_2$ are the children
of internal node U, then there is an edge $\{v_1, v_2\} \in E$, the gluing edge, such that
$v_1 \in U_1$ and $v_2 \in U_2$.

Figure 1 shows a graph G and two assembly trees for G using the edge gluing rule.
Throughout this paper, until the last section, all assembly trees use the edge gluing 
rule. Therefore, the term "assembly tree" will refer to an assembly tree using the edge 
gluing rule. 
Figure 1: Two assembly trees for the graph G.

The subject of the paper is, given a graph G, to enumerate the number

a(G)

of assembly trees of G. Gluing sequences are defined in Section 2 and are used to
enumerate the assembly trees for paths, cycles and certain star graphs. The
concept of an H-graph is defined in Section 3; the complete multipartite graphs are special
cases. A generating function formula for the number of assembly trees for any H-graph
is provided in Section 3. Section 4 considers three specific examples of H-graphs which
lead to frequently encountered families of graphs, such as complete bipartite graphs or
complete tripartite graphs. For each of these examples, the relevant multivariate
generating function is computed, then the diagonal of that generating function is introduced
and studied. Very strong recent results of Doron Zeilberger and Moa Apagodu enable
us to prove polynomial recurrence relations for the coefficients of these diagonals, while
results of Zeilberger and Jet Wimp allow us to find the growth rate of these coefficients at
an arbitrary level of precision. In particular, we obtain the growth rates for the
number of assembly trees of the complete bipartite graphs $K_{n,n}$ and the complete tripartite
graphs $K_{n,n,n}$ as a function of n. Open questions and further research directions are
offered in Section 5.

2 Paths, Cycles and Stars 

An assembly tree for a graph G = (V, E) of order n, as defined in the introduction, is a
binary tree with n leaves and $n - 1$ internal nodes. To each internal node U there is a
corresponding gluing edge as in Definition 2 (not necessarily unique), which we denote
by $e_U \in E$.

Lemma 1. If T is an assembly tree for a connected graph G = (V, E), then the set of
gluing edges $\{e_U \mid U \text{ is an internal node of } T\}$ is a spanning tree of G.

Proof. If the set $S := \{e_U \mid U \text{ is an internal node of } T\}$ of gluing edges
is not spanning, then the root of T would not be V. If S contains a cycle, then T
would have a node with just one child. If S is not connected, then G would not be
connected. □

If $S \subseteq E(G)$ is any spanning tree of a connected graph G, then any linear ordering
$e_1, e_2, \ldots, e_{n-1}$ of the edges in S induces an assembly tree for G as follows. Build the
tree T from the bottom up. The leaves are the singleton vertices of G. Assume that we
have proceeded through the sequence of edges from $e_1$ to $e_{i-1}$. For $e_i = \{u_1, u_2\}$ add
a node to T whose two children are the already constructed nodes $U_1$ and $U_2$ such that
$u_1 \in U_1$ and $u_2 \in U_2$. Call an ordering $e_1, e_2, \ldots, e_{n-1}$ of the edges of the spanning tree
a gluing sequence. The elements of a gluing sequence for G are the gluing edges of the
corresponding assembly tree T.

Example 3. Consider the 4-cycle $C_4$ in Figure 2. This example shows that two different
spanning trees can induce the same assembly tree: the gluing sequences $(e_1, e_2, e_3)$
and $(e_1, e_2, e_4)$ produce the same assembly tree. Moreover, two different orderings, for
example $(e_1, e_2, e_3)$ and $(e_2, e_1, e_3)$, of the same spanning tree can produce the same
assembly tree.
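The bottom-up construction described above can be sketched in a few lines of Python. This is a minimal illustration, not code from the paper: the function name `assembly_tree`, the vertex labels 1 to 4, and the edge labeling of $C_4$ (with $e_1$ and $e_2$ chosen as opposite edges) are assumptions made so that the claims of Example 3 can be checked; the labeling in Figure 2 may differ. An assembly tree is represented by its set of node labels, which determines the tree up to label-preserving isomorphism.

```python
def assembly_tree(vertices, gluing_sequence):
    """Return the node labels of the assembly tree induced by a gluing sequence.

    The tree is represented as a frozenset of frozensets, one per node.
    """
    # Every vertex starts as its own leaf; component[v] is the topmost
    # node built so far that contains v.
    component = {v: frozenset([v]) for v in vertices}
    labels = set(component.values())
    for u1, u2 in gluing_sequence:
        part1, part2 = component[u1], component[u2]
        if part1 == part2:
            raise ValueError("the edges do not form a spanning tree")
        merged = part1 | part2  # new internal node with children part1 and part2
        labels.add(merged)
        for v in merged:
            component[v] = merged
    return frozenset(labels)


# Hypothetical labeling of the 4-cycle: e1 and e2 are opposite edges.
V = [1, 2, 3, 4]
e1, e2, e3, e4 = (1, 2), (3, 4), (2, 3), (4, 1)

same_tree_other_spanning_tree = assembly_tree(V, [e1, e2, e3]) == assembly_tree(V, [e1, e2, e4])
same_tree_other_ordering = assembly_tree(V, [e1, e2, e3]) == assembly_tree(V, [e2, e1, e3])
different_tree = assembly_tree(V, [e1, e2, e3]) != assembly_tree(V, [e1, e3, e2])
```

Under this labeling all three flags evaluate to True: the first two reproduce the coincidences of Example 3, while the third shows that reordering a gluing sequence can also change the tree.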

For the star $S_n := K_{1,n}$, the spanning tree is $S_n$ itself, and each gluing sequence
produces a distinct assembly tree. This leads immediately to the following result.

Proposition 1. For the star $S_n$, the number of assembly trees is $a(S_n) = n!$.

Consider the star $S_n^2$ with n arms such that each arm has length 2, as in Figure 3.






Figure 2: The cycle C4. 



Theorem 4.

$$a(S_n^2) = \sum_{k=0}^{n} \binom{n}{k} \frac{(2n-k)!}{2^{n-k}}.$$

Proof. Suppose that k of the e-edges come first in the gluing sequence. In Figure 3
we refer to the e-edges and the w-edges. There are $\binom{n}{k}$ ways to choose these edges,
and the order does not matter for the assembly tree. There are $2n - k$ edges that
remain. For convenience label them $e_1, e_2, \ldots, e_{n-k}$ and $w_1, w_2, \ldots, w_n$. These $2n - k$
edges can be placed in any order in the gluing sequence as long as the edge $w_i$ comes
after $e_i$ for $i = 1, 2, \ldots, n - k$. Each such order determines a distinct assembly tree. To
determine the number of such permutations, first choose k positions for $w_{n-k+1}, \ldots, w_n$.
There are $\binom{2n-k}{k}$ ways to do this, and for each such choice the edges $w_{n-k+1}, \ldots, w_n$
can be permuted in $k!$ ways. The remaining $2(n - k)$ positions are to be filled by
the edges $e_1, e_2, \ldots, e_{n-k}$ and $w_1, w_2, \ldots, w_{n-k}$ so that the edge $w_i$ comes after $e_i$ for
$i = 1, 2, \ldots, n - k$. The number of ways to do this equals the number of permutations
of $2(n - k)$ objects where there are 2 objects of type 1, 2 objects of type 2, ..., 2 objects
of type $n - k$. This is equal to $(2(n-k))!/2^{n-k}$. So, with k of the e-edges coming first in the
gluing sequence there are

$$\binom{n}{k}\binom{2n-k}{k}\, k!\, \frac{(2(n-k))!}{2^{n-k}} = \binom{n}{k}\frac{(2n-k)!}{2^{n-k}}$$

assembly trees. Summing over all possible values of k from k = 0 to k = n gives the
formula in the statement of the theorem. □



Theorem 5. If $P_n$ is the path and $C_n$ is the cycle on n vertices, then

$$a(P_n) = \frac{1}{n}\binom{2n-2}{n-1}, \qquad a(C_n) = \frac{1}{2}\binom{2n-2}{n-1}.$$

Proof. For the path, the unique spanning tree S consists of all the edges of the path.
We proceed by induction. First consider the number of assembly trees in the case that
$e \in S$ is the last edge in the gluing sequence. If the removal of e from G results in
subgraphs of orders k and $n - k$, then the number of assembly trees such that e is
the last edge in the gluing sequence is $a(P_k)\,a(P_{n-k})$. If T and T' are assembly trees
coming from gluing sequences with distinct last elements, then $T \neq T'$. Therefore
$a(P_n) = \sum_{k=1}^{n-1} a(P_k)\,a(P_{n-k})$, which is a well-known recurrence for the Catalan numbers.

Concerning the cycle, there are n spanning trees of $C_n$. Given any one of these
spanning trees, by the result above for the path, there are $\frac{1}{n}\binom{2n-2}{n-1}$ corresponding assembly
trees for $C_n$, hence a total of $n \cdot \frac{1}{n}\binom{2n-2}{n-1} = \binom{2n-2}{n-1}$ assembly trees. But each of these is
counted twice for the following reason. The assembly tree corresponding to a sequence of
edges in a spanning tree for which e is the last edge in the gluing sequence and f is the
edge of G not in the gluing sequence is equal to the assembly tree for which f is the last
edge in the gluing sequence and e is the edge of G not in the gluing sequence. □
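The closed formulas of this section can be checked on small graphs by brute force. The sketch below is an illustration rather than the method of the paper: it assumes that an assembly tree is determined by its set of node labels, so the number of assembly trees of a connected vertex set U equals the sum, over all unordered splits of U into two connected parts, of the product of the counts of the parts (an edge between the two parts exists automatically because U is connected). The names `count_assembly_trees`, `path_edges` and `cycle_edges` are hypothetical, and vertices are numbered from 0.

```python
from functools import lru_cache
from math import comb, factorial


def count_assembly_trees(n, edges):
    """Count assembly trees (edge gluing rule) of a connected graph on vertices 0..n-1."""
    adjacency = [0] * n  # adjacency[v] is the bitmask of the neighbours of v
    for u, v in edges:
        adjacency[u] |= 1 << v
        adjacency[v] |= 1 << u

    @lru_cache(maxsize=None)
    def is_connected(mask):
        seen = frontier = mask & -mask  # start from the lowest vertex in mask
        while frontier:
            v = frontier.bit_length() - 1
            frontier &= ~(1 << v)
            new = adjacency[v] & mask & ~seen
            seen |= new
            frontier |= new
        return seen == mask

    @lru_cache(maxsize=None)
    def count(mask):
        if mask & (mask - 1) == 0:  # a single vertex: the leaf is the only tree
            return 1
        low = mask & -mask  # the part containing the lowest vertex is listed first,
        rest = mask ^ low   # so every unordered split is visited exactly once
        total = 0
        sub = rest
        while sub:
            sub = (sub - 1) & rest  # runs through the proper subsets of rest, ending at 0
            part, other = low | sub, rest ^ sub
            if is_connected(part) and is_connected(other):
                total += count(part) * count(other)
        return total

    full = (1 << n) - 1
    if not is_connected(full):
        raise ValueError("the graph must be connected")
    return count(full)


def path_edges(n):
    return [(i, i + 1) for i in range(n - 1)]


def cycle_edges(n):
    return path_edges(n) + [(n - 1, 0)]


# Compare with Theorem 5 for n = 6.
path_matches = count_assembly_trees(6, path_edges(6)) == comb(10, 5) // 6
cycle_matches = count_assembly_trees(6, cycle_edges(6)) == comb(10, 5) // 2
```

The enumeration is exponential in the number of vertices, so it is only meant as a sanity check for graphs with roughly a dozen vertices or fewer.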

3 H-graphs

Let H be a connected graph with vertex set $[N] := \{1, 2, \ldots, N\}$, and let $\phi : [N] \to \{0, 1\}$
be a labeling of the vertices of H. For any sequence $(n_1, n_2, \ldots, n_N)$ of
non-negative integers, define a graph $G_{(H,\phi)}(n_1, n_2, \ldots, n_N)$ as follows. The vertex set is

$$V(G_{(H,\phi)}(n_1, n_2, \ldots, n_N)) := \{(i, j) : i \in [N],\ 1 \le j \le n_i\}$$

and distinct vertices $(i, j)$ and $(i', j')$ are adjacent if and only if

$i = i'$ and $\phi(i) = 1$, or
$i \neq i'$ and $\{i, i'\} \in E(H)$.

This is equivalent to saying that the graph $G_{(H,\phi)}(n_1, n_2, \ldots, n_N)$ is obtained by
replacing each vertex i of H by a complete graph of order $n_i$ or its complement ($n_i$ isolated
vertices), and by replacing each edge of H by all possible edges between the graphs that
replace the two end vertices of that edge. Call a graph an $(H, \phi)$-graph if it is of the
form $G_{(H,\phi)}(n_1, n_2, \ldots, n_N)$ for some choice of the parameters $\{n_1, \ldots, n_N\}$.
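To make the construction concrete, here is a minimal Python sketch that builds the vertex and edge sets of $G_{(H,\phi)}(n_1, \ldots, n_N)$ directly from the adjacency rule. The function name `h_phi_graph` and the dictionary-based encoding of $\phi$ and of the sizes $n_i$ are illustrative assumptions, not notation from the paper.

```python
from itertools import combinations


def h_phi_graph(h_edges, phi, sizes):
    """Build G_(H,phi)(n_1, ..., n_N).

    h_edges: iterable of pairs (i, i2), the edges of H
    phi:     dict mapping each vertex i of H to 0 or 1
    sizes:   dict mapping each vertex i of H to n_i
    Returns the list of vertices (i, j) and the set of edges (as frozensets).
    """
    h = {frozenset(e) for e in h_edges}
    vertices = [(i, j) for i in sorted(sizes) for j in range(1, sizes[i] + 1)]
    edges = set()
    for (i, j), (i2, j2) in combinations(vertices, 2):
        if i == i2:
            adjacent = phi[i] == 1              # same class: clique iff phi(i) = 1
        else:
            adjacent = frozenset((i, i2)) in h  # different classes: follow the edges of H
        if adjacent:
            edges.add(frozenset(((i, j), (i2, j2))))
    return vertices, edges


# H a single edge with phi identically 0 gives the complete bipartite graph K_{2,3}.
k23_vertices, k23_edges = h_phi_graph([(1, 2)], {1: 0, 2: 0}, {1: 2, 2: 3})
# H a single vertex with phi = 1 gives the complete graph K_4.
k4_vertices, k4_edges = h_phi_graph([], {1: 1}, {1: 4})
```

With these inputs, `k23_edges` has the 6 edges of $K_{2,3}$ and `k4_edges` has the 6 edges of $K_4$.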

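The following notation will be used in this section.

1. $\mathbf{n} := (n_1, n_2, \ldots, n_N)$

2. $\mathbf{n} \ge \mathbf{k}$ if and only if $n_i \ge k_i$ for all i

3. $\mathbf{x} := (x_1, x_2, \ldots, x_N)$

4. $\mathbf{n}! = n_1!\, n_2! \cdots n_N!$

5. $\mathbf{x}^{\mathbf{n}} = x_1^{n_1} x_2^{n_2} \cdots x_N^{n_N}$

6. $\binom{\mathbf{n}}{\mathbf{k}} = \binom{n_1}{k_1}\binom{n_2}{k_2}\cdots\binom{n_N}{k_N}$

7. $a_{(H,\phi)}(\mathbf{n}) = a(G_{(H,\phi)}(n_1, n_2, \ldots, n_N))$

8. $A_{(H,\phi)}(\mathbf{x}) = \sum_{\mathbf{n} \ge \mathbf{0}} a_{(H,\phi)}(\mathbf{n})\, \frac{\mathbf{x}^{\mathbf{n}}}{\mathbf{n}!}$

The last entry in the above list is the exponential generating function for the number of
assembly trees of an $(H, \phi)$-graph. The zero vector is denoted $\mathbf{0}$.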

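Theorem 6. The exponential generating function for a connected $(H, \phi)$-graph is

$$A_{(H,\phi)}(\mathbf{x}) = 1 - \sqrt{1 - 2\sum_{i=1}^{N} x_i + \sum_{\phi(i)=0} x_i^2 + 2\sum_{\{i,j\} \notin E(H)} x_i x_j}\,.$$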

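Proof. Let $\mathbf{0}$ be the all zeros vector; let $\mathbf{e}_i$ be the vector with each coordinate 0 except
the i-th coordinate, which is 1; and let $\mathbf{e}_{i,j}$ be the vector with each coordinate 0 except the i-th and
j-th coordinates, which are 1. Note that

$$a_{(H,\phi)}(\mathbf{0}) = 0,$$
$$a_{(H,\phi)}(\mathbf{e}_i) = 1 \text{ for all } i,$$
$$a_{(H,\phi)}(2\mathbf{e}_i) = 0 \text{ if } \phi(i) = 0,$$
$$a_{(H,\phi)}(\mathbf{e}_{i,j}) = \begin{cases} 1 & \text{if } \{i,j\} \in E(H) \\ 0 & \text{if } \{i,j\} \notin E(H). \end{cases}$$

The following recurrence holds for all $\mathbf{n}$ except those of the form $\mathbf{e}_i$, $\mathbf{e}_{i,j}$ when $\{i,j\} \notin E(H)$,
and $2\mathbf{e}_i$ when $\phi(i) = 0$. To simplify notation, denote $a_{(H,\phi)}(\mathbf{n})$ by $a(\mathbf{n})$ and
$A_{(H,\phi)}(\mathbf{x})$ by $A(\mathbf{x})$.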



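$$a(\mathbf{n}) = \frac{1}{2} \sum_{\mathbf{0} \le \mathbf{k} \le \mathbf{n}} \binom{\mathbf{n}}{\mathbf{k}}\, a(\mathbf{k})\, a(\mathbf{n} - \mathbf{k}).$$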



The recurrence above is obtained by considering the two subtrees $T_1$ and $T_2$, rooted at
each of the children of the root of an assembly tree. The tree $T_1$ is itself an assembly
tree of a graph of the form $G_{(H,\phi)}(k_1, k_2, \ldots, k_N)$ where $k_i \le n_i$ for all i, and $T_2$ is an
assembly tree of a graph of the form $G_{(H,\phi)}(n_1 - k_1, n_2 - k_2, \ldots, n_N - k_N)$. Now



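$$A(\mathbf{x}) = \sum_{\mathbf{n} \ge \mathbf{0}} a(\mathbf{n}) \frac{\mathbf{x}^{\mathbf{n}}}{\mathbf{n}!}
= \sum_{\mathbf{n} \ge \mathbf{0}} \Bigg[ \frac{1}{2} \sum_{\mathbf{0} \le \mathbf{k} \le \mathbf{n}} \binom{\mathbf{n}}{\mathbf{k}} a(\mathbf{k})\, a(\mathbf{n} - \mathbf{k}) \Bigg] \frac{\mathbf{x}^{\mathbf{n}}}{\mathbf{n}!}
+ \sum_{i=1}^{N} x_i - \frac{1}{2}\sum_{\phi(i)=0} x_i^2 - \sum_{\{i,j\} \notin E(H)} x_i x_j$$

$$= \frac{1}{2} A(\mathbf{x})^2 + \sum_{i=1}^{N} x_i - \frac{1}{2}\sum_{\phi(i)=0} x_i^2 - \sum_{\{i,j\} \notin E(H)} x_i x_j.$$

The added terms $\sum_{i=1}^{N} x_i - \frac{1}{2}\sum_{\phi(i)=0} x_i^2 - \sum_{\{i,j\} \notin E(H)} x_i x_j$ correct for the three cases for
which the recurrence does not hold. Solving for $A(\mathbf{x})$ by the quadratic formula yields
the generating function in the statement of the theorem. □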

Some special cases follow as corollaries. For example, if H is just a single vertex
v and $\phi(v) = 1$, then $G_{(H,\phi)}(n)$ is the complete graph. Therefore, by Theorem 6, the
generating function for the complete graph is $1 - \sqrt{1 - 2x}$, which when expanded gives
the following result.

Corollary 1. If $K_n$ is the complete graph, then

$$a(K_n) = \frac{(2n-2)!}{2^{n-1}(n-1)!}.$$



Corollary 2. If $K_{n_1, n_2, \ldots, n_N}$ is the complete multipartite graph, then its exponential
generating function is

$$A(\mathbf{x}) = 1 - \sqrt{1 - N + \sum_{i=1}^{N} (1 - x_i)^2}\,.$$

In particular, the generating function for the number of assembly trees of the complete
bipartite graph $K_{n_1, n_2}$ is

$$A(x, y) = 1 - \sqrt{(1-x)^2 + (1-y)^2 - 1}\,.$$

Proof. If H is $K_N$ and $\phi$ is identically 0, then $G_{(H,\phi)}(n_1, n_2, \ldots, n_N)$ is the complete
multipartite graph. □

Expanding the generating function A(x, y) counting assembly trees of complete
bipartite graphs using Maple, we arrive at the exponential generating function

$$A(x, y) = y + x + xy + xy^2 + x^2y + xy^3 + x^3y + \tfrac{5}{2}x^2y^2 + \tfrac{9}{2}x^2y^3 + x^4y$$
$$+ \tfrac{9}{2}x^3y^2 + xy^4 + 7x^2y^4 + 7x^4y^2 + \tfrac{25}{2}x^3y^3 + 10x^2y^5 + \tfrac{55}{2}x^3y^4$$
$$+ \tfrac{55}{2}x^4y^3 + 10x^5y^2 + \tfrac{27}{2}x^2y^6 + \tfrac{645}{8}x^4y^4 + \tfrac{105}{2}x^3y^5$$
$$+ \tfrac{27}{2}x^6y^2 + \tfrac{105}{2}x^5y^3 + yx^5 + y^5x + yx^6 + y^6x + yx^7 + y^7x + yx^8$$
$$+ \tfrac{35}{2}y^2x^7 + 91y^3x^6 + \tfrac{1575}{8}y^4x^5 + \tfrac{1575}{8}y^5x^4 + 91y^6x^3$$
$$+ \tfrac{35}{2}y^7x^2 + y^8x + \cdots,$$

which gives the following table for the number of assembly trees. 

a(K_{m,n})   n = 1   n = 2   n = 3   n = 4
m = 1           1       2       6      24
m = 2                  10      54     336
m = 3                         450    3960
m = 4                               46440
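The expansion can also be reproduced without a computer algebra system. The sketch below is an independent check, not the Maple computation of the paper: squaring $1 - A = \sqrt{(1-x)^2 + (1-y)^2 - 1}$ gives the functional equation $A = x + y - \tfrac{1}{2}(x^2 + y^2) + \tfrac{1}{2}A^2$, which determines the coefficients degree by degree. The function names are hypothetical. Since A is an exponential generating function, the coefficient of $x^m y^n$ has to be multiplied by $m!\,n!$ to obtain $a(K_{m,n})$.

```python
from fractions import Fraction
from math import factorial


def bipartite_egf_coefficients(max_degree):
    """Return c[(i, j)] = [x^i y^j] A(x, y) for all i + j <= max_degree.

    Uses A = x + y - (x^2 + y^2)/2 + A^2/2, processed by total degree.
    """
    c = {}
    for total in range(max_degree + 1):
        for i in range(total + 1):
            j = total - i
            value = Fraction(0)
            if (i, j) in ((1, 0), (0, 1)):
                value += 1
            if (i, j) in ((2, 0), (0, 2)):
                value -= Fraction(1, 2)
            # A has no constant term, so both factors of A^2 have
            # total degree strictly between 0 and `total`.
            for a in range(i + 1):
                for b in range(j + 1):
                    if 0 < a + b < total:
                        value += c[(a, b)] * c[(i - a, j - b)] / 2
            c[(i, j)] = value
    return c


def assembly_trees_bipartite(m, n, c):
    """Number of assembly trees of K_{m,n} from the EGF coefficients c."""
    return c[(m, n)] * factorial(m) * factorial(n)


coefficients = bipartite_egf_coefficients(8)
diagonal = [assembly_trees_bipartite(n, n, coefficients) for n in range(1, 5)]
```

With exact rational arithmetic the coefficient of $x^4y^4$ comes out as 645/8, and multiplying by $4!\,4! = 576$ gives the entry for $K_{4,4}$.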

The diagonal elements 1, 10, 450, 46440, ... are the number of assembly trees for
$K_{1,1}, K_{2,2}, K_{3,3}, K_{4,4}, \ldots$, a sequence which does not match anything in the Online
Encyclopedia of Integer Sequences [8]. We will return to this sequence in the next section,
where we will find the asymptotic growth rate of the sequence and a polynomial
recurrence relation satisfied by the sequence.

Consider the set of all $(H, \phi)$-graphs on n labeled vertices. In other words, such a
graph is obtained by choosing, say, $n_1$ of the n vertices to correspond to vertex 1 of
H, $n_2$ of the n vertices to correspond to vertex 2 of H, ..., $n_N$ of the n vertices to
correspond to vertex N of H. Let $b_{(H,\phi)}(n)$ denote the total number of assembly trees
of all the possible $(H, \phi)$-graphs on n labeled vertices, and $B_{(H,\phi)}(x)$ the corresponding
exponential generating function

$$B_{(H,\phi)}(x) = \sum_{n=0}^{\infty} b_{(H,\phi)}(n)\, \frac{x^n}{n!}.$$

Corollary 3. The exponential generating function of the number of assembly trees of
all connected $(H, \phi)$-graphs of order n such that the number of vertices in H is N, the
number of edges in H is M, and the number of i such that $\phi(i) = 0$ is J is

$$B_{(H,\phi)}(x) = 1 - \sqrt{1 - 2Nx + \left(2\binom{N}{2} - 2M + J\right)x^2}\,.$$

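Proof. We have

$$B_{(H,\phi)}(x) = \sum_{n=0}^{\infty} b_{(H,\phi)}(n)\, \frac{x^n}{n!}
= \sum_{n=0}^{\infty} \Bigg[ \sum_{n_1 + n_2 + \cdots + n_N = n} \binom{n}{n_1, n_2, \ldots, n_N} a_{(H,\phi)}(\mathbf{n}) \Bigg] \frac{x^n}{n!}$$

$$= \sum_{n=0}^{\infty} \Bigg[ \sum_{n_1 + n_2 + \cdots + n_N = n} \frac{a_{(H,\phi)}(\mathbf{n})}{n_1!\, n_2! \cdots n_N!} \Bigg] x^n
= \sum_{\mathbf{n} \ge \mathbf{0}} a_{(H,\phi)}(\mathbf{n})\, \frac{x^{n_1 + \cdots + n_N}}{\mathbf{n}!}
= A_{(H,\phi)}(x, x, \ldots, x)$$

$$= 1 - \sqrt{1 - 2Nx + \sum_{\phi(i)=0} x^2 + 2\sum_{\{i,j\} \notin E(H)} x^2}
= 1 - \sqrt{1 - 2Nx + \left(2\binom{N}{2} - 2M + J\right)x^2}\,,$$

where the second-to-last equality follows from Theorem 6. □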
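As a quick sanity check of Corollary 3 (a worked instance added for illustration, not taken from the paper), take H to be a single edge with $\phi$ identically 0. The $(H, \phi)$-graphs are then the complete bipartite graphs, and N = 2, M = 1, J = 2. The series coefficients below follow from $B = 2x - x^2 + \tfrac{1}{2}B^2$, which is the squared form of the formula in the corollary.

```latex
\begin{align*}
B(x) &= 1 - \sqrt{1 - 4x + \left(2\tbinom{2}{2} - 2 + 2\right)x^2}
      = 1 - \sqrt{1 - 4x + 2x^2} \\
     &= 2x + x^2 + 2x^3 + \tfrac{9}{2}x^4 + \cdots,
\end{align*}
so that $n!\,[x^n]B(x) = 2,\ 2,\ 12,\ 108$ for $n = 1, 2, 3, 4$.
% Direct count for n = 4, using a(K_{1,3}) = 6 and a(K_{2,2}) = 10:
% \binom{4}{1} a(K_{1,3}) + \binom{4}{2} a(K_{2,2}) + \binom{4}{3} a(K_{3,1}) = 24 + 60 + 24 = 108.
```

The value 108 agrees with the direct count over all ways of splitting four labeled vertices into the two classes, the splits (4, 0) and (0, 4) contributing nothing because the resulting graphs are disconnected.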



4 Examples 

In this section, we consider a few interesting examples. In these examples, the graph H
that is the basis for the construction of an H-graph will be very small (two or three vertices),
and $n_1 = n_2$ or $n_1 = n_2 = n_3$ will hold, resulting in graphs $G_{(H,\phi)}$ with two or three
classes of vertices in some obvious sense.
4.1 Theoretical Background
4.1.1 Power Series in One Variable 

Let $\mathbb{C}[n]$ denote the ring of all polynomials in one variable over the field of complex
numbers, and let $\mathbb{C}[[x]]$ denote the ring of all formal power series with complex coefficients.
In what follows, we present a few important definitions and theorems on one-variable
power series. The interested reader can consult Chapter 6 of [9] for a deeper introduction
to the topic, including the proofs of the theorems we include here.

Definition 7. A sequence $f(0), f(1), \ldots$ of complex numbers is called polynomially
recursive, or p-recursive, if there exist polynomials $P_0, P_1, \ldots, P_k \in \mathbb{C}[n]$, with $P_k \neq 0$, so
that

$$P_k(n+k) f(n+k) + P_{k-1}(n+k-1) f(n+k-1) + \cdots + P_0(n) f(n) = 0 \qquad (1)$$

for all natural numbers n.

Definition 8. We say that the power series $u(x) \in \mathbb{C}[[x]]$ is d-finite if there exists a
positive integer d and polynomials $p_0(x), p_1(x), \ldots, p_d(x)$ so that $p_d \neq 0$ and

$$p_d(x) u^{(d)}(x) + p_{d-1}(x) u^{(d-1)}(x) + \cdots + p_1(x) u'(x) + p_0(x) u(x) = 0. \qquad (2)$$

Here $u^{(j)} = \frac{d^j u}{dx^j}$.

Theorem 9. The sequence $f(0), f(1), \ldots$ is p-recursive if and only if its ordinary
generating function

$$u(x) = \sum_{n=0}^{\infty} f(n) x^n \qquad (3)$$

is d-finite.

Definition 10. The formal power series $f \in \mathbb{C}[[x]]$ is called algebraic if there exist
polynomials $P_0(x), P_1(x), \ldots, P_d(x) \in \mathbb{C}[x]$ that are not all equal to zero so that

$$P_0(x) + P_1(x) f(x) + \cdots + P_d(x) f^d(x) = 0. \qquad (4)$$

The smallest positive d for which such polynomials exist is called the degree of f.

Theorem 11. If $f \in \mathbb{C}[[x]]$ is algebraic, then it is d-finite.



We point out that the converse of Theorem 11 is not true. For instance,
$f(x) = \sum_{n \ge 1} \frac{x^n}{n} = \ln\left(\frac{1}{1-x}\right)$ is d-finite, but not algebraic, as we will soon see.

One way to prove that a power series is not algebraic is by proving that it is not
d-finite. Another way of proving that a power series is not algebraic is by showing that
it does not have the "right" growth rate. The following theorem of Jungen is a powerful
tool in doing so.

Theorem 12. [7] Let $f(x) = \sum_{n \ge 0} a_n x^n \in \mathbb{C}[[x]]$ be an algebraic power series, and let
us assume that $a_n \sim c\, n^r \alpha^n$, where c and $\alpha$ are non-zero complex constants, and r is a
negative real constant.

Then $r = s + \frac{1}{2}$, for some negative integer s.

In particular, selecting c = 1, r = −1 and $\alpha = 1$, we see that $f(x) = \sum_{n \ge 1} \frac{x^n}{n}$ is not
algebraic.






4.1.2 Power Series in Several Variables 

Now we consider formal power series in several variables. For a deeper introduction to
the topic, including the proofs of the theorems we present, see [6]. Let $\mathbb{C}[[x_1, x_2, \ldots, x_k]]$
denote the algebra of all formal power series in variables $x_1, x_2, \ldots, x_k$ over the field of
complex numbers.

Definition 13. Let $f(n_1, n_2, \ldots, n_k) : \mathbb{N}^k \to \mathbb{C}$ be a function, and let
$F(x_1, x_2, \ldots, x_k) = \sum_{n_1, n_2, \ldots, n_k} f(n_1, n_2, \ldots, n_k)\, x_1^{n_1} x_2^{n_2} \cdots x_k^{n_k} \in \mathbb{C}[[x_1, x_2, \ldots, x_k]]$.
We say that F is d-finite if all the derivatives

$$\left(\frac{\partial}{\partial x_1}\right)^{d_1} \left(\frac{\partial}{\partial x_2}\right)^{d_2} \cdots \left(\frac{\partial}{\partial x_k}\right)^{d_k} F$$

for $d_i \ge 0$ lie in a finite dimensional vector space over the field of rational functions
$\mathbb{C}(x_1, x_2, \ldots, x_k)$.

Theorem 14. Let $F \in \mathbb{C}[[x_1, x_2, \ldots, x_k]]$. If F is algebraic, then it is d-finite.

The notion of the diagonal of a multivariate power series F is a natural one in that 
it enables us to focus on the coefficients of F that are often the most interesting for 
practical purposes. 

Definition 15. Let $f(n_1, n_2, \ldots, n_k) : \mathbb{N}^k \to \mathbb{C}$ be a function, and let
$F(x_1, x_2, \ldots, x_k) = \sum_{n_1, n_2, \ldots, n_k} f(n_1, n_2, \ldots, n_k)\, x_1^{n_1} x_2^{n_2} \cdots x_k^{n_k}$. Then the diagonal of the multivariate power
series $F(x_1, x_2, \ldots, x_k)$ is the univariate power series

$$\operatorname{diag} F(x) = \sum_{n} f(n, n, \ldots, n)\, x^n.$$

Example 16. Let $F(s, t) = \frac{1}{1 - s - t} = \sum_{m \ge 0} (s + t)^m$. Then for every $n \in \mathbb{N}$, the
coefficient of $s^n t^n$ in F(s, t) is equal to $\binom{2n}{n}$. Therefore,

$$\operatorname{diag} F(x) = \sum_{n \ge 0} \binom{2n}{n} x^n = \frac{1}{\sqrt{1 - 4x}}.$$

Theorem 17. Let $F \in \mathbb{C}[[x_1, x_2, \ldots, x_k]]$. If F is d-finite, then diag F(x) is also d-finite.
4.2 Three families of graphs 

Our first example of computing the diagonal of a power series $A_{(H,\phi)}$ is included
because of the precise nature of the answer that we are able to compute.

Example 18. Let H be a graph on vertex set {u, v}, with $\phi(u) = 0$ and $\phi(v) = 1$, and
with one edge, the edge uv. The graphs $G_{(H,\phi)}$ are the graphs consisting of a subgraph
G' consisting of $n_1$ independent vertices and a complete subgraph G'' on $n_2$ vertices so
that G' and G'' are vertex-disjoint, and each vertex of G' is adjacent to each vertex of
G''. Then by Theorem 6, we have

$$A_{(H,\phi)}(x, y) = 1 - \sqrt{1 - 2x - 2y + y^2}\,.$$
In particular, the diagonal of $A_{(H,\phi)}(x, y)$ counts the number of assembly trees of
such graphs with $n_1 = n_2$. In order to compute this diagonal, note that

$$A_{(H,\phi)}(x, y) = 1 - \sqrt{1 - 2x - 2y + y^2} = 1 - \left(1 - (2x + 2y - y^2)\right)^{1/2}.$$

By the Binomial theorem, we know that

$$\left(1 - (2x + 2y - y^2)\right)^{1/2} = \sum_{m \ge 0} (-1)^m \binom{1/2}{m} (2x + 2y - y^2)^m.$$

When computing the mth power of $(2x + 2y - y^2)$, let us consider the summand
$(2x)^i (2y)^j (-y^2)^{m-i-j}$. The number of such summands is clearly $\binom{m}{i,\ j,\ m-i-j}$. Such a
summand will contain x and y raised to the same exponent n if and only if i = n and
$n = j + (2m - 2n - 2j)$, that is, when $3n = 2m - j$. In particular, $(2x + 2y - y^2)^m$ will
contain a constant multiple of $x^n y^n$ if and only if $1.5n \le m \le 2n$. Therefore, if we denote
the coefficient of $x^n y^n$ in $1 - \sqrt{1 - 2x - 2y + y^2}$ by $a_{n,n}$, then routine simplifications
lead to the formulas

$$\operatorname{diag} A_{(H,\phi)}(z) = 1 - \sum_{n \ge 0} \sum_{m = \lceil 3n/2 \rceil}^{2n} \binom{1/2}{m} \binom{m}{n} \binom{m-n}{2n-m} 4^{m-n} z^n \qquad (5)$$

and, for $n \ge 1$,

$$a(G_{(H,\phi)}(n, n)) = -(n!)^2 \sum_{m = \lceil 3n/2 \rceil}^{2n} \binom{1/2}{m} \binom{m}{n} \binom{m-n}{2n-m} 4^{m-n}. \qquad (6)$$

We would like to point out that even if we have given an exact formula for $a_{n,n}$, the
question of how fast $a_{n,n}$ grows is far from being answered. We will discuss that
question in Section 4.4. Furthermore, $A_{(H,\phi)}(x, y)$ is algebraic, so by Theorem 14 it is
d-finite. Therefore, Theorem 17 implies that diag $A_{(H,\phi)}(z)$ is d-finite. So, by Theorem
9, the coefficients of diag $A_{(H,\phi)}(z)$ must satisfy a polynomial recurrence relation. What
is that relation? We will return to this question in Section 4.3.



Example 19. Let H be a graph on vertex set {u, v}, having one edge, the edge uv, and
set $\phi(u) = \phi(v) = 0$. Then, as we have seen in Corollary 2, we have

$$A(x, y) = 1 - \sqrt{(1 - x)^2 + (1 - y)^2 - 1}$$

for the generating function of the number of assembly trees of a complete bipartite
graph.

Determining the coefficients of diag A(x, y) is much more difficult than it was for the
bivariate generating function in Example 18, since directly applying the Binomial
theorem would simply lead to formulae that are too complicated to be useful. Nevertheless,
some powerful techniques recently developed by Doron Zeilberger and Moa Apagodu
will enable us to determine these numbers. We will do this in Section 4.3. The same
is true for the case of the complete tripartite graph, which is the subject of the next
example.

Example 20. Let H be a graph on vertex set {u, v, w}, having edge set {uv, uw, vw},
and set $\phi(u) = \phi(v) = \phi(w) = 0$. Then, as we have seen in Corollary 2, we have

$$A(x, y, z) = 1 - \sqrt{(1 - x)^2 + (1 - y)^2 + (1 - z)^2 - 2}\,.$$
4.3 Finding Recurrence Relations

In this section, our goal is to find polynomial recurrence relations for the one-variable
generating functions studied in Examples 18, 19 and 20. As we explained in the
discussion of Example 18, such recurrence relations exist, since our functions are diagonals of
algebraic, and hence d-finite, power series in several variables.

Until recently, the best available technique at this point would have been simply 
guessing. That is, one would have had to assume that the sought polynomial recurrence 
relation does not consist of too many terms, and does not involve polynomials of too 
high degrees, and then have a software package look for a suitable recurrence relation 
within those limits. One major problem with this approach is that even if the software 
package does return a recurrence relation that is satisfied by all available data points, a 
theoretical proof that the obtained recurrence relation is satisfied by all natural numbers 
n is still lacking. 

The breakthrough is achieved by the following theorem of Zeilberger and Apagodu. 
(We present a simplified version of the theorem. The interested reader should consult 
[2] for the full version.) 

Theorem 21. Let

$$F(n; x_1, x_2, \ldots, x_d) = \prod_{p=1}^{P} S_p(x_1, x_2, \ldots, x_d)^{\alpha_p} \cdot \left( \frac{s(x_1, x_2, \ldots, x_d)}{t(x_1, x_2, \ldots, x_d)} \right)^{n}$$

where the $\alpha_p$ are commuting indeterminates, and where the $S_p$, s and t are elements of
$\mathbb{C}[x_1, x_2, \ldots, x_d]$.

Then there exists a non-negative integer L, there exist L + 1 polynomials in n,
$e_0(n), e_1(n), \ldots, e_L(n)$, not all zero, and there exist d rational functions $R_i(n; x_1, x_2, \ldots, x_d)$
(with $i = 1, 2, \ldots, d$) such that the functions

$$G_i(n; x_1, x_2, \ldots, x_d) := R_i(n; x_1, x_2, \ldots, x_d)\, F(n; x_1, x_2, \ldots, x_d)$$

satisfy the equations

$$\sum_{i=0}^{L} e_i(n)\, F(n + i; x_1, x_2, \ldots, x_d) = \sum_{i=1}^{d} D_{x_i} G_i(n; x_1, x_2, \ldots, x_d). \qquad (7)$$

Furthermore, there exists a constant $N = N(\deg(s), \deg(t), \sum_{p=1}^{P} \deg(S_p))$ such that
$L \le N$ and $\deg(e_i) \le N$, and therefore, the polynomials $e_i(n)$ can be explicitly computed.

Note that Theorem 21 eliminates the need for "guessing" the polynomial recurrence
satisfied by the functions $F(n + i; x_1, x_2, \ldots, x_d)$. Indeed, because of the existence of
the upper bound N for the number and maximum degree of the polynomials $e_i(n)$,
finding the polynomials $e_i(n)$ in (7) is simply the question of solving a (possibly huge)
system of linear equations. Furthermore, by Theorem 21 we know that a polynomial
recurrence relation (7) exists, so once we have enough equations in our system to obtain
a unique solution for the vector $e(n) = (e_0(n), e_1(n), \ldots, e_L(n))$, we can be sure that
the polynomial recurrence relation defined by e(n) is indeed a recurrence relation that
is satisfied by the $F(n + i; x_1, x_2, \ldots, x_d)$ for all n. So Theorem 21 does take care of
both problems we had with simply guessing a polynomial recurrence relation.
Now apply Theorem 21 to Example 18 by letting d = 2, $x_1 = x$, and $x_2 = y$. Set
$S_1(x, y) = 1 - 2x - 2y + y^2$, with $\alpha_1 = 1/2$, and, crucially, $S_2(x, y) = xy$ with $\alpha_2 = -1$.
Finally, set s(x, y) = 1 and t(x, y) = xy. This leads to

$$F(n; x, y) = \frac{\sqrt{1 - 2x - 2y + y^2}}{x^{n+1} y^{n+1}}. \qquad (8)$$

Consider the right-hand side as a power series in two variables and integrate both sides
of (8) on a two-dimensional polydisk whose interior contains 0. By the two-variable
residue theorem, we see that the only summand in the right-hand side whose integral
does not vanish is $a_{n,n} \cdot \frac{1}{xy}$. In fact, by the residue theorem, we get

$$\oint F(n; x, y) = -4\pi^2 a_{n,n}.$$

Therefore, the polynomial recurrence relation (7) is equivalent to a polynomial
recurrence relation for the numbers $a_{n,n}$. In order to obtain this recurrence relation,
we need to solve a large system of linear equations. Fortunately, Doron Zeilberger's
software package, SMAZ [3], can do that for us. The result is the following theorem.


Theorem 22. Let $a_n = a_{n,n}$ be the coefficient of $z^n$ in (5). Recall that $a_n$ counts
assembly trees of graphs studied in Example 18. Then we have $a_0 = 0$, $a_1 = 1$, and

$$a_{n+1} = \frac{3}{2} \cdot \frac{(3n - 1)(3n + 1)}{(n + 1)^2}\, a_n \qquad (9)$$

for $n \ge 1$.

(Note that the above discussion computes a recurrence relation for the coefficients
of diag $\sqrt{1 - 2x - 2y + y^2}$ and not diag $(1 - \sqrt{1 - 2x - 2y + y^2})$, but it is obvious that
starting with the coefficient of z, the coefficients of these two power series will satisfy
the same recurrence relation.)

Similarly, we can apply Theorem 21 to obtain a polynomial recurrence relation for
the coefficients $b_n$ of $z^n$ in diag A(z), where A(x, y) is the bivariate generating function of
the number of assembly trees of complete bipartite graphs as computed in Example 19.
We assign the same values to the various parameters as we did immediately preceding
(8), except that we set $S_1(x, y) = x^2 + y^2 - 2x - 2y + 1$. The result is the following.
Let $[z^n] g(z)$ denote the coefficient of $z^n$ in the power series g(z).

Theorem 23. Let $b_n = [z^n] \operatorname{diag}\left(1 - \sqrt{x^2 + y^2 - 2x - 2y + 1}\right)$. Note that $b_n$ is the
number of assembly trees of $K_{n,n}$, the graph studied in Example 19. Then we have
$b_0 = 0$, $b_1 = 1$, $b_2 = 5/2$, and

$$b_{n+2} = \frac{2(6n^2 + 12n + 5)}{(n + 2)^2}\, b_{n+1} - \frac{n(2n - 1)(2n + 3)}{(n + 2)^2 (n + 1)}\, b_n \qquad (10)$$

if $n \ge 1$.
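Recurrences (9) and (10) can be checked numerically against the generating functions. The sketch below is an independent check in exact rational arithmetic, not the SMAZ computation: it assumes the recurrences in the form $a_{n+1} = \frac{3}{2}\frac{(3n-1)(3n+1)}{(n+1)^2}a_n$ and $b_{n+2} = \frac{2(6n^2+12n+5)}{(n+2)^2}b_{n+1} - \frac{n(2n-1)(2n+3)}{(n+2)^2(n+1)}b_n$, and it uses the fact that $A = 1 - \sqrt{1 - 2q}$ satisfies $A = q + \tfrac{1}{2}A^2$ for a polynomial q without constant term. The function name `series_coefficients` is hypothetical.

```python
from fractions import Fraction


def series_coefficients(q, max_degree):
    """Coefficients of the bivariate series A solving A = q + A^2/2.

    q maps exponent pairs (i, j) to coefficients and has no constant term,
    so that A = 1 - sqrt(1 - 2q). Returns a dict for all i + j <= max_degree.
    """
    c = {}
    for total in range(max_degree + 1):
        for i in range(total + 1):
            j = total - i
            # Both factors of A^2 have total degree strictly between 0 and `total`.
            square = sum(
                (c[(a, b)] * c[(i - a, j - b)]
                 for a in range(i + 1) for b in range(j + 1)
                 if 0 < a + b < total),
                Fraction(0),
            )
            c[(i, j)] = Fraction(q.get((i, j), 0)) + square / 2
    return c


HALF = Fraction(1, 2)
# Example 18: 1 - 2x - 2y + y^2 = 1 - 2q with q = x + y - y^2/2.
q18 = {(1, 0): 1, (0, 1): 1, (0, 2): -HALF}
# Example 19: 1 - 2x - 2y + x^2 + y^2 = 1 - 2q with q = x + y - x^2/2 - y^2/2.
q19 = {(1, 0): 1, (0, 1): 1, (2, 0): -HALF, (0, 2): -HALF}

N = 5
c18 = series_coefficients(q18, 2 * N)
c19 = series_coefficients(q19, 2 * N)
a = [c18[(n, n)] for n in range(N + 1)]  # diagonal coefficients a_0, ..., a_N
b = [c19[(n, n)] for n in range(N + 1)]  # diagonal coefficients b_0, ..., b_N

recurrence_9_holds = all(
    2 * (n + 1) ** 2 * a[n + 1] == 3 * (3 * n - 1) * (3 * n + 1) * a[n]
    for n in range(1, N)
)
recurrence_10_holds = all(
    (n + 2) ** 2 * (n + 1) * b[n + 2]
    == 2 * (6 * n * n + 12 * n + 5) * (n + 1) * b[n + 1]
    - n * (2 * n - 1) * (2 * n + 3) * b[n]
    for n in range(1, N - 1)
)
```

Both recurrences are cleared of denominators before comparison, so the checks are exact. Such a check only covers finitely many n; the proof for all n is what Theorem 21 supplies.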

Finally, let $c_n = [t^n] \operatorname{diag} A(t)$, where A(x, y, z) is the trivariate generating function
of the number of assembly trees of complete tripartite graphs as computed in Example
20. We can then use Theorem 21 with d = 3, $x_1 = x$, $x_2 = y$, $x_3 = z$. Set
$S_1(x, y, z) = (1 - x)^2 + (1 - y)^2 + (1 - z)^2 - 2$, the expression under the square root in
A(x, y, z) as defined in Example 20, with $\alpha_1 = 1/2$, and $S_2(x, y, z) = xyz$ with $\alpha_2 = -1$.
Finally, set s(x, y, z) = 1, and t(x, y, z) = xyz. We then get the following result.
Theorem 24. Let $c_n = [t^n] \operatorname{diag} A(t)$, where

$$A(x, y, z) = 1 - \sqrt{(1 - x)^2 + (1 - y)^2 + (1 - z)^2 - 2}\,.$$

Then we have $c_0 = 0$, $c_1 = 3$, $c_2 = 84$, $c_3 = 4935$, and

$$c_{n+3} = r_2(n)\, c_{n+2} + r_1(n)\, c_{n+1} + r_0(n)\, c_n, \qquad (11)$$

if $n \ge 1$. Here the $r_i$ are explicitly known rational functions of n, with numerators and
denominators of degree 11 for $r_0$, degree ten for $r_1$, and degree nine for $r_2$.

4.4 Finding Growth Rates 

As far as determining the growth rate of the sequence $a_1, a_2, \ldots$, recurrence relation (9)
is much more useful than the explicit formula that we found for $a_n$ in (6). Indeed, the
exponential growth rate of $a_n$ is easy to read off from (9). It is routine to prove that

$$\lim_{n \to \infty} \sqrt[n]{a_n} = 13.5.$$

Determining the growth rate of $a_n$ at a higher level of precision is much more difficult.
The theoretical foundation of this computation is the paper [10] by Doron Zeilberger
and Jet Wimp. In that paper, the authors consider a more general setup, but in the
examples we study, their method simplifies to the following.

Let $t_n$ be a sequence for which a polynomial recurrence relation is known, and of
which we want to compute the asymptotics. Try to obtain $t_n$ in the form

$$t_n = E_n K_n,$$

where

$$E_n = e^{\mu_0 n \ln n + \mu_1 n}\, n^{\theta}$$

and

$$K_n = \exp\left( \alpha_1 n^{\beta} + \alpha_2 n^{\beta - 1/\rho} + \alpha_3 n^{\beta - 2/\rho} + \cdots \right).$$

Here $\alpha_1 \neq 0$, $\beta = j/\rho$, and $0 \le j < \rho$. This decomposition leads to the formula

$$\frac{t_{n+k}}{t_n} = n^{\mu_0 k} \lambda^k \left( 1 + \frac{\theta k + k^2 \mu_0 / 2}{n} + \cdots \right) \exp\left( \alpha_1 \beta k\, n^{\beta - 1} + \cdots \right),$$

where $\lambda = e^{\mu_0 + \mu_1}$. Then, using the polynomial recurrence relation for $t_n$, determine
the parameters in $E_n$ and $K_n$, obtaining this way an exact formula (in the form of
an infinite sum) for $t_n$. This computation can certainly be long and tedious, but the
software package AsyRec [11] can carry it out. For the sequence $a_n$, we obtain the
following result.



Theorem 25. Let $a_n$ be defined as in Theorem 22. Note that $a_n$ is the number of
assembly trees of the graph discussed in Example 18. Then we have

$$a_n \sim C\, \frac{13.5^n}{n^2} \left( 1 + \frac{1}{9n} + \frac{5}{81n^2} + \cdots \right)$$

for a constant C. In particular, $a_n \sim C\, \frac{13.5^n}{n^2}$, and therefore, by Theorem 12, diag $A_{(H,\phi)}(z)$ is not algebraic.

Applying the same method for the polynomial recurrence relation proved for the
numbers $b_n$ in Theorem 23, we get the following asymptotic expressions.



Theorem 26. Let $b_n$ be defined as in Theorem 23, that is, let $b_n$ be the number of
assembly trees of the complete bipartite graph $K_{n,n}$, which was studied in Example 19.
Then we have

$$b_n \sim C\, \frac{(6 + 4\sqrt{2})^n}{n^2} \left( 1 + \frac{3}{8n} - \frac{5\sqrt{2}}{32n} + \cdots \right)$$

for a constant C. In particular, $b_n \sim C\, \frac{(6 + 4\sqrt{2})^n}{n^2}$, and therefore, by Theorem 12, diag $A(z)$ is not algebraic.
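The growth rates can be illustrated numerically by iterating the recurrences. The sketch below is not the AsyRec computation; it assumes recurrences (9) and (10) in the form $a_{n+1} = \frac{3}{2}\frac{(3n-1)(3n+1)}{(n+1)^2}a_n$ and $b_{n+2} = \frac{2(6n^2+12n+5)}{(n+2)^2}b_{n+1} - \frac{n(2n-1)(2n+3)}{(n+2)^2(n+1)}b_n$, together with the correction factor $1 + \frac{1}{9n} + \frac{5}{81n^2}$ for $a_n$. All function names are hypothetical.

```python
from fractions import Fraction
from math import sqrt


def a_sequence(n_max):
    """a_0, ..., a_{n_max} from recurrence (9), in exact arithmetic."""
    a = [Fraction(0), Fraction(1)]
    for n in range(1, n_max):
        a.append(Fraction(3 * (3 * n - 1) * (3 * n + 1), 2 * (n + 1) ** 2) * a[n])
    return a


def b_sequence(n_max):
    """b_0, ..., b_{n_max} from recurrence (10), in exact arithmetic."""
    b = [Fraction(0), Fraction(1), Fraction(5, 2)]
    for n in range(1, n_max - 1):
        b.append(
            Fraction(2 * (6 * n * n + 12 * n + 5), (n + 2) ** 2) * b[n + 1]
            - Fraction(n * (2 * n - 1) * (2 * n + 3), (n + 2) ** 2 * (n + 1)) * b[n]
        )
    return b


def normalized_a(a, n):
    """a_n * n^2 / 13.5^n divided by the correction 1 + 1/(9n) + 5/(81 n^2)."""
    correction = 1 + Fraction(1, 9 * n) + Fraction(5, 81 * n * n)
    return float(a[n] * n * n / Fraction(27, 2) ** n / correction)


a = a_sequence(400)
b = b_sequence(400)

# If the expansion for a_n is right, this quantity is nearly independent of n.
stability = normalized_a(a, 200) / normalized_a(a, 400)

# For b_n: the ratio of consecutive terms approaches 6 + 4*sqrt(2), and
# n * (ratio / lambda - 1) approaches the exponent of n, here -2.
lam = 6 + 4 * sqrt(2)
ratio_b = float(b[400] / b[399])
exponent_estimate = 399 * (ratio_b / lam - 1)
```

In this experiment `stability` differs from 1 only by a term of order $n^{-3}$, and `exponent_estimate` is close to −2. An exponent of −2 is not of the form $s + \frac{1}{2}$ with s an integer, which is what the appeal to Theorem 12 requires.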



5 Questions 

There are reasonable alternatives to the edge gluing rule for defining an assembly tree. 
Two possibilities are the following. For each of these two rules, the problem is again to 
enumerate the number of assembly trees for interesting graphs.



Definition 27. An assembly tree for a connected graph G using the connected gluing
rule is an assembly tree for G (satisfying properties (1-4) in Section 1), and also satisfying
the additional property:

5. For each node, the graph induced by the vertices in the label is connected. 



This rule is less restrictive than the edge gluing rule. In particular, an assembly tree of
a graph G is not necessarily a binary tree. At the extreme is the assembly tree of depth 
1 for which every vertex of G is a child of the root. 



Definition 28. In this definition, we denote each face of a plane graph by the set of 
vertices on that face. An assembly tree for a connected plane graph G using the face 
gluing rule is an assembly tree for G satisfying the additional property. 

5. For each internal node U there is a face F such that $C \cap F \neq \emptyset$ for each $C \in c(U)$
and $F \subseteq \bigcup c(U)$.

Figure 4 shows a plane graph and one assembly tree using the face gluing rule.

All the graphs in Sections 2 and 3 for which we were able to compute the number of
assembly trees share a common property: the number of connected induced subgraphs,
up to unlabeled isomorphism type, is small. For example, the number of connected
induced subgraphs of the path $P_n$ is n. For the complete graph $K_n$, it is n, and for
the complete bipartite graph $K_{m,n}$ it is mn. With $(H, \phi)$ fixed, the number of
connected induced subgraphs of any $(H, \phi)$-graph is again polynomial in the parameters
$n_1, n_2, \ldots, n_N$. For results on graphs with few isomorphism types of induced subgraphs
see [1] and [5]. So the question arises: is it possible to enumerate the assembly
trees (by the edge gluing rule) for a family of graphs for which the number of isomorphism
types of induced subgraphs is not small? In particular, consider the family of caterpillar
graphs $D_n$ on 2n vertices as shown in Figure 5. The number of isomorphism types of
induced subgraphs is clearly large, exponential in n.
Figure 4: A plane graph and an assembly tree using the face gluing rule.



Question. Is there a reasonable enumeration of the number of assembly trees for the
family $D_n$?






Figure 5: Caterpillar graph $D_7$.